Challenges and some new directions in channel coding

Erdal Arıkan, Najeeb ul Hassan, Michael Lentmaier, Guido Montorsi and Jossy Sayir


Abstract: Three areas of ongoing research in channel coding are
surveyed, and recent developments are presented in each area:
spatially coupled Low-Density Parity-Check (LDPC) codes,
non-binary LDPC codes, and polar coding.

Index Terms: LDPC codes, spatial coupling, non-binary codes,
polar codes, channel polarization.

I. Introduction 

The history of channel coding began hand in hand with Shannon’s
information theory [1]. Following on the pioneering work of
Golay [2] and Hamming [3], the majority of linear codes developed
in the early ages of coding theory were “error correction” codes
in the sense that their aim was to correct errors made by the
channel. The channel was universally assumed to be a Binary
Symmetric Channel (BSC). The study of error correction codes
culminated with the invention of Reed-Solomon codes [4] in 1960,
which are Maximum Distance Separable (MDS) over non-binary fields
and hence are guaranteed to correct or detect the largest number
of errors possible for a given code length and dimension.

In parallel to the evolution of linear block codes, the invention
of convolutional codes by Peter Elias in 1955 [5] led to a
different approach and to the invention of trellis-based decoding
methods such as the Viterbi algorithm [6], [7] and the BCJR
algorithm [8]. Both of these algorithms can be easily adapted to
any channel and hence generalise the concept of error correction
to general channels that cannot be described simply in terms of
probability of error. We now speak of “channel coding” rather
than “error correction coding”. Further progress in channel
coding was made by Gottfried Ungerboeck [9] by linking coding to
modulation for convolutional codes.

In 1993, Claude Berrou and co-authors shocked the coding research
community in [10] by designing a coding system known as “turbo
codes” that achieved a quantum leap in the performance of codes
over general channels. They obtained very good error performance
within a small margin of the channel capacity, something that most
coding theorists had thought impossible with practical systems of
moderate complexity. Yet Berrou’s approach achieved this in an
eminently implementable system and with linear decoding
complexity. In the subsequent scramble to explain the theory
behind this puzzling performance, a method originally developed by
Robert Gallager in his PhD thesis, known as Low-Density
Parity-Check (LDPC) coding, was rediscovered and shown to have
comparable properties. Both these methods have become the
workhorses of modern communication standards, with arguments about
the technical advantages of one over the other mostly obscured by
business and standardization interests of the argumenter. What is
clear and undisputed is that LDPC codes are easier to explain and
analyse and hence should probably take precedence over turbo codes
in teaching. It is nowadays well-known that both LDPC codes and
turbo codes can be viewed as sparse codes on graphs. As a
consequence they share a lot of properties, and any construction
or analysis method that can be applied to one of them can usually
be replicated for the other. Some technical differences between
LDPC and turbo codes may tilt the balance towards one or the other
in specific applications.

We could conclude this history of coding here and bury the topic
into dusty textbooks, sending it the same way as classical
Newtonian mechanics¹ and other topics made obsolete by quantum
leaps in research. Many coding researchers nowadays are confronted
with the recurrent “Coding is dead” motto of experts claiming
that, now that capacity is achieved, there is nothing further to
be researched in the field. In fact, as this paper will contribute
to showing, coding is still an ongoing and very active topic of
research with advances and innovations to address important and
practical unsolved problems.

Current hurdles in the applicability of modern coding techniques
can be classified in two categories:

Complexity. While turbo and LDPC codes have brought
capacity-approaching performance within reach of implementable
systems, implementable does not necessarily mean practical. The
complexity of codes that perform well under practical constraints
such as limited decoding delay and high spectral efficiency is
still a major hurdle for low power implementations in integrated
circuits. There is a serious need for new methods that simplify
code design, construction, storage, and decoder implementation.

New applications. Turbo and LDPC codes can be seen to “solve” the
capacity problem for elementary point-to-point channels. Recent
years have seen advances in information theory for many multi-user
channels such as the multiple access, broadcast, relay and
interference channels. As communication standards become more
ambitious in exploiting the available physical resources such as
spectrum and geographical reach, there is a push to switch from
interference-limited parallel point-to-point protocols to true
multi-user processing with joint encoding and/or decoding. There
is a need for coding methods that can do this efficiently for all
of the scenarios described. Furthermore, theory has gone further
than pure communications by expanding to distributed compression
and joint source/channel coding, distributed storage, network
coding, and quantum channels and protocols. All of these new
theories come with their own requirements and constraints for
coding, and hence coding research is far from dead when it comes
to these new applications.

¹Apologies to mechanics researchers for the seemingly disparaging remark.
In fact, we are aware that classical mechanics is an ongoing and modern research
topic as evidenced by many journals and conferences, just as coding theory is.

The paper will present three areas of ongoing research in coding, 
all of which have some degree of relevance to the two challenges 
described. 

In Section II, we will address spatially coupled LDPC codes,
which have a structure akin to convolutional codes. For spatially
coupled codes the asymptotic performance of an iterative decoder
is improved to that of an optimal decoder, which opens the way for
new degrees of freedom in the code design. For example, it is
possible to achieve capacity universally for a large class of
channels with simple regular SC-LDPC codes where irregular LDPC
codes would require careful individual optimizations of their
degree profiles. We will discuss the design of SC-LDPC codes for
flexible rates, efficient window decoding techniques for reduced
complexity and latency, and the robustness of their decoding for
mobile radio channels. In Section III, we will address non-binary
LDPC and related codes. These are codes over higher order
alphabets that can, for example, be mapped directly onto a
modulation alphabet, making them interesting for high spectral
efficiency applications. While these have been known for a while,
the complexity of decoding has made them unsuited for most
practical applications. In this section, we will discuss research
advances in low complexity decoding and also present a class of
LDPC codes with an associated novel decoding algorithm known as
Analog Digital Belief Propagation (ADBP) whose complexity does not
increase with alphabet size and hence constitutes a promising
development for very high spectral efficiency communications.
Finally, in Section IV, we will introduce polar coding, a new
technique introduced in [14] based on a phenomenon known as
channel polarization, that has the flexibility and versatility to
be an interesting contender for many novel application scenarios.

II. Spatially Coupled LDPC Codes 

Fig. 1. Illustration of edge spreading: the protograph of a (3,6)-regular block
code represented by a base matrix B is repeated L = 6 times and the edges are
spread over time according to the component base matrices B_0, B_1, and B_2,
resulting in a terminated LDPC convolutional code.

The roots of low-density parity-check (LDPC) codes trace back to
the concept of random coding. It can be shown that a randomly
generated code decoded with an optimal decoder exhibits very good
performance with high probability. However, such a decoder is
infeasible in practice because its complexity increases
exponentially with the code length. The groundbreaking idea of
Gallager was to slightly change the random ensemble in such a way
that the codes can be decoded efficiently by an iterative
algorithm, now known as belief propagation (BP) decoding. His
LDPC codes were defined by sparse parity-check matrices H that
contain fixed numbers K and J of non-zero values in every row and
column, respectively; these are known as regular LDPC codes.
Gallager was able to show that the minimum distance of typical
codes of the ensemble grows linearly with the block length, which
guarantees that very strong codes can be constructed if large
blocks are allowed. The complexity per decoded bit, on the other
hand, is independent of the length if the number of decoding
iterations is fixed.
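As an illustration of the regular structure described above, the following sketch builds a parity-check matrix with K ones per row and J ones per column by stacking J column-permuted copies of a simple band. The function name, the seed argument and the choice of this particular banded construction are assumptions made for the example; no attempt is made to avoid short cycles.

```python
import numpy as np

def gallager_parity_check(n, J, K, seed=0):
    """Return a (n*J/K) x n binary matrix with J ones per column, K per row."""
    if n % K != 0:
        raise ValueError("n must be a multiple of K")
    rng = np.random.default_rng(seed)
    rows_per_band = n // K
    # First band: row i checks the K consecutive symbols i*K .. i*K + K - 1.
    band = np.zeros((rows_per_band, n), dtype=np.uint8)
    for i in range(rows_per_band):
        band[i, i * K:(i + 1) * K] = 1
    # The remaining J - 1 bands are random column permutations of the first.
    bands = [band] + [band[:, rng.permutation(n)] for _ in range(J - 1)]
    return np.vstack(bands)

# Hypothetical toy parameters: a (3,6)-regular matrix of length 24.
H = gallager_parity_check(n=24, J=3, K=6)
```

Every column of H belongs to exactly one row in each band, which is what fixes the column weight at J regardless of the permutations drawn.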

The asymptotic performance of an iterative decoder can be
analyzed by tracking the probability distributions of messages
that are exchanged between nodes in the Tanner graph (density
evolution). The worst channel parameter for which the decoding
error probability converges to zero is called the BP threshold.
The BP thresholds of turbo codes are actually better than those of
the original regular LDPC codes of Gallager. A better BP threshold
is obtained by allowing the nodes in the Tanner graph to have
different degrees. By optimizing the degrees of the resulting
irregular LDPC code ensembles it is possible to push the BP
thresholds towards capacity. However, this requires a large
fraction of low-degree variable nodes, which leads to higher error
floors at large SNRs. As a consequence of the degree optimization,
the capacity achieving sequences of irregular LDPC codes no longer
show a linear growth of the minimum distance.
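For the binary erasure channel (BEC), density evolution reduces to a scalar recursion, which makes the notion of a BP threshold easy to reproduce numerically. The sketch below assumes a (J,K)-regular ensemble on the BEC with erasure probability eps and locates the threshold by bisection; the function names, tolerances and iteration limits are illustrative choices.

```python
def bec_de_converges(eps, J, K, max_iter=20000, tol=1e-10):
    """Run BEC density evolution; True if the erasure probability goes to 0."""
    x = eps
    for _ in range(max_iter):
        # Check node update followed by variable node update.
        x_new = eps * (1.0 - (1.0 - x) ** (K - 1)) ** (J - 1)
        if x_new < tol:
            return True
        if abs(x_new - x) < 1e-14:  # stuck at a non-zero fixed point
            return False
        x = x_new
    return False

def bp_threshold_bec(J, K, steps=20):
    """Bisection for the largest eps at which density evolution converges."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if bec_de_converges(mid, J, K):
            lo = mid
        else:
            hi = mid
    return lo

threshold_36 = bp_threshold_bec(J=3, K=6)
```

For the (3,6)-regular ensemble this lands near 0.429, noticeably below the Shannon limit of 0.5 for rate 1/2 on the BEC.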

LDPC convolutional codes were invented by Jiménez Feltström and
Zigangirov in [16]. Like LDPC block codes, they are defined by
sparse parity-check matrices, which can be infinite but have a
band-diagonal structure like the generator matrices of classical
convolutional codes. When the parity-check matrix is composed of
individual permutation matrices, the structure of an LDPC code
ensemble can be described by a protograph [17] (a prototype graph)
and its corresponding base matrix B. The graph of an LDPC
convolutional code can be obtained by starting from a sequence of
L independent protographs of an LDPC block code, which are then
interconnected by spreading the edges over blocks of different
time instants [18]. The maximum width of this edge spreading
determines the memory, m_cc, of the resulting chain of length L
that defines the LDPC convolutional code ensemble. Since the
blocks of the original protograph codes are coupled together by
this procedure, LDPC convolutional codes are also called spatially
coupled LDPC codes (SC-LDPC). Figure 1 shows an illustration of
the edge spreading procedure.
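The edge spreading procedure can be sketched at the level of base matrices. The example below assumes the (3,6)-regular base matrix B = [3 3] is split into three components B_0 = B_1 = B_2 = [1 1]; this particular split is an assumption chosen for illustration, and any components that sum to B would serve. The helper name is hypothetical.

```python
import numpy as np

def coupled_base_matrix(components, L):
    """Place B_0..B_mcc on a band diagonal and terminate after L blocks."""
    mcc = len(components) - 1
    b_c, b_v = components[0].shape
    B_sc = np.zeros(((L + mcc) * b_c, L * b_v), dtype=int)
    for t in range(L):                       # column block (time instant) t
        for i, B_i in enumerate(components):  # B_i sits i block rows lower
            rows = slice((t + i) * b_c, (t + i + 1) * b_c)
            cols = slice(t * b_v, (t + 1) * b_v)
            B_sc[rows, cols] = B_i
    return B_sc

B = np.array([[3, 3]])
components = [np.array([[1, 1]])] * 3        # assumed split: B_0 + B_1 + B_2 = B
B_sc = coupled_base_matrix(components, L=6)
# Termination adds mcc extra check blocks, which lowers the design rate.
design_rate = 1 - B_sc.shape[0] / B_sc.shape[1]
```

With L = 6 and memory 2 the design rate is 1/3 instead of 1/2; the loss shrinks as L grows, since the number of check blocks is L + m_cc for L variable blocks.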

A BP threshold analysis of LDPC convolutional codes shows that
the performance of the iterative decoder is improved significantly
by spatial coupling. In fact, the results in [19] show that
asymptotically, as L tends to infinity, the BP threshold is
boosted to that of the optimal maximum a posteriori (MAP) decoder.
Stimulated by these findings, Kudekar, Richardson and Urbanke
developed an analytical proof of this threshold saturation
phenomenon [20], [21]. More recently, potential functions have
been identified as a powerful tool for characterizing the
connection between MAP thresholds and BP thresholds. All these
approaches make use of the area theorem [23] in order to derive
bounds on the MAP threshold and prove threshold saturation for
spatially coupled codes. Since the MAP thresholds of regular LDPC
ensembles with increasing node degrees are known to converge to
capacity, it follows that spatial coupling provides a new way of
provably achieving capacity with low-complexity iterative BP
decoding, not only for the BEC but also for the AWGN channel.
Furthermore, the spatially coupled code ensembles inherit the
linearly increasing minimum distance property from their uncoupled
counterparts. This combination of capacity achieving thresholds
with low complexity decoding and linearly increasing distance is
quite unique and has attracted a lot of interest in the research
community.
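Threshold saturation can be observed numerically with a vector version of BEC density evolution, one erasure probability per position in the chain. The sketch below assumes a randomized (J, K, L, w) ensemble in which every edge connects uniformly to one of w neighbouring positions; this is an assumed model used for illustration and not necessarily the protograph ensemble of Fig. 1. Setting w = 1 recovers the uncoupled recursion.

```python
import numpy as np

def coupled_bec_de_converges(eps, J=3, K=6, L=20, w=3,
                             max_iter=5000, tol=1e-9):
    """Position-wise BEC density evolution for a terminated coupled chain."""
    kernel = np.ones(w) / w
    pad = np.zeros(w - 1)            # positions outside the chain are known
    x = np.full(L, float(eps))
    for _ in range(max_iter):
        x_padded = np.concatenate([pad, x, pad])
        # Each of the L + w - 1 check positions averages w variable positions.
        check_in = np.convolve(x_padded, kernel, mode="valid")
        check_out = 1.0 - (1.0 - check_in) ** (K - 1)
        # Each variable position averages the w check positions it connects to.
        x_new = eps * np.convolve(check_out, kernel, mode="valid") ** (J - 1)
        if x_new.max() < tol:
            return True
        if np.max(np.abs(x_new - x)) < 1e-13:   # non-zero fixed point
            return False
        x = x_new
    return False

coupled_ok = coupled_bec_de_converges(0.45, w=3)
uncoupled_ok = coupled_bec_de_converges(0.45, w=1)
```

At eps = 0.45 the uncoupled (3,6) recursion stalls at a non-zero fixed point, whereas in the coupled chain the known boundary positions start a decoding wave that travels inwards until all positions are recovered.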

The capacity achieving property of regular SC-LDPC codes raises
the question whether irregularity is still needed at all. In
principle, it is possible for any arbitrary rational rate to
construct regular codes that guarantee a vanishing gap to capacity
with BP decoding. On the other hand, for some specific code rates,
the required node degrees and hence the decoding complexity
increase drastically. But even if we neglect the complexity, there
exists another problem of practical significance that so far has
not received much attention in the literature: for large node
degrees J and K the threshold saturation effect will only occur
for larger values of the coupling parameter m_cc, as illustrated
for the BEC in Fig. 2. We can see that for a given coupling width
w = m_cc + 1, the gap to capacity becomes small only for certain
code rates R, and it turns out that these rates correspond to the
ensembles for which the variable node degree J is small.

Motivated by this observation, in [25] some nearly-regular SC-LDPC
code ensembles were introduced, which are built upon the mixture
of two favorable regular codes of the same variable node degree.
The key is to allow for a slight irregularity in the code graph to
add a degree of freedom that can be used for supporting arbitrary
rational rates as accurately as needed while keeping the check and
variable degrees as low as possible. These codes exhibit
performance close to the Shannon limit for all rates in the rate
interval considered, while having a decoder complexity as low as
for the best regular codes. The exclusion of variable nodes of
degree two in the construction ensures that the minimum distance
of the proposed ensembles increases linearly with the block
length, i.e., the codes are asymptotically good.


A. Efficient Decoding of Spatially Coupled Codes 

Fig. 2. Density evolution thresholds for (J, K)-regular SC-LDPC
ensembles in comparison with the Shannon limit ε_Sh. The coupling width w is
equal to m_cc + 1. For a given rate R = 1 − J/K, the smallest pair of values
J and K are chosen under the condition that J ≥ 3. The ensembles with
minimum variable node degree J = 3 are highlighted with squares.

Fig. 3. Window decoder of size W = 4 at time t. The green variable nodes
represent decoded blocks and the red variable nodes (y_t) are the target block
within the current window. The dashed lines represent the read access to the
m_cc previously decoded blocks.

In order to achieve the MAP threshold, the number L of coupled
code blocks should be sufficiently large for reducing the rate
loss due to termination of the chain. But running the BP decoder
over the complete chain of length L would then result in a large
latency and decoding complexity and hence is not feasible in
practical scenarios. However, thanks to the limited width of the
non-zero region around the diagonal, SC-LDPC codes can be decoded
in a continuous fashion using a sliding window decoder [26] of
size W (W ≪ L). As a result, decoding latency and decoding
complexity become independent of L. Moreover, the storage
requirements for the decoder are reduced by a factor of L/W
compared to a non-windowed decoder. An example of the window
decoder of size W = 4 is given in Fig. 3.
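The bookkeeping of a sliding window decoder can be sketched without the BP updates themselves. The generator below is a hypothetical helper that, for each time t, lists the target block, the blocks inside the window, and the previously decoded blocks that are only read; it assumes the window simply shrinks at the end of a terminated chain.

```python
def window_schedule(L, W, mcc):
    """Yield (target, window blocks, read-only blocks) for each time t."""
    for t in range(L):
        window = list(range(t, min(t + W, L)))        # updated by BP at time t
        read_only = list(range(max(0, t - mcc), t))   # already decoded blocks
        yield t, window, read_only

# Hypothetical parameters: chain of 10 blocks, window of 4, memory 2.
schedule = list(window_schedule(L=10, W=4, mcc=2))
```

The decoder never holds more than W + m_cc blocks at once, which is why latency and storage depend on W rather than on the chain length L.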

It has been shown in [27] that for equal structural latency,
SC-LDPC codes under window decoding outperform LDPC codes for
short to long latency values and outperform convolutional codes
from medium to long latency values. For applications requiring
very short latency, Viterbi decoded convolutional codes were still
found to be the optimal choice [27]. Note that only structural
latency was considered in all these comparisons, which is defined
as the number of bits required before decoding can start. It
therefore can be concluded that for low transmission rate
applications (in the range of bits per second), convolutional
codes with moderate constraint length are favorable since the
delay in filling the decoder buffer dominates the overall latency.
By contrast, for applications with transmission rates in excess of
several gigabits per second, e.g., short range communication,
medium to large structural latency is tolerable and strong codes
such as SC-LDPC codes provide a gain in performance compared to
conventional convolutional codes. Another advantage of using a
window decoder is the flexibility in terms of decoding latency at
the decoder. Since the window size W is a decoder parameter, it
can be varied without changing the code, providing a flexible
trade-off between performance and latency.

Fig. 4. Illustration of block-fading channel for a codeword of length N and
F = 2.


In BP decoding, messages are passed between the check and variable
nodes according to a parallel (flooding) or serial (on-demand)
rule [30]. In both schedules, all the nodes in the graph are
typically updated at every decoding iteration (uniform schedules).
For both LDPC and SC-LDPC codes, a uniform serial decoding
schedule reduces complexity by a factor of two when applied over
the complete length of the code [30]. However, this gain reduces
to only 20% when uniform serial schedules are applied within a
decoding window [31], [32]. In order to reduce the decoding
complexity for window decoding, non-uniform window decoding
schedules have been introduced, which result in a 50% reduction in
complexity compared to uniform decoding schedules. The reduction
in decoding complexity is achieved by avoiding unnecessary updates
of nodes not directly connected to the first position in the
window. Only nodes that show improvement based on their BER
compared to the previous iteration are updated in the next
iteration.


B. Performance over Mobile Radio Channels 

One of the most remarkable features of spatially coupled codes is
their universality property, which means that a single code
construction performs well for a large variety of channel
conditions. For discrete-input memoryless symmetric channels the
universality of SC-LDPC codes has been shown in [21]. In this
section we consider the block-fading channel and demonstrate that
SC-LDPC codes show a remarkable performance on this class of
channels.


The block-fading channel was introduced in [33] to model the
mobile-radio environment. This model is useful because the channel
coherence time in many cases is much longer than one symbol
duration and several symbols are affected by the same fading
coefficient. The coded information is transmitted over a finite
number of fading blocks to provide diversity. An example where a
codeword of length N spreads across F = 2 fading realizations is
shown in Fig. 4. In general, when dealing with block-fading
channels, two strategies can be adopted: coding with block
interleaving or coding with memory [34]. Spatially coupled codes,
with their convolutional structure among LDPC codes, are expected
to be a nice example of the second strategy.

Fig. 5. Density evolution outage for SC-LDPC codes with memory 0, 1, 2 and
3. The bold lines represent the DEO and dashed lines represent the simulation
results when a code with N = 200, L = 100, is decoded using a window
decoder, F = 2.

The block-fading channel is characterized by an outage
probability, which serves as a lower bound on the word error
probability for any code decoded using a maximum likelihood
decoder. In terms of density evolution, the density evolution
outage (DEO) is the event when the bit error probability does not
converge to zero for a fixed value of SNR after a finite or an
infinite number of decoding iterations are performed [35]. The
probability of density evolution outage, for a fixed value of SNR,
can then be calculated using a Monte Carlo method considering a
significant number of fading coefficients.
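The Monte Carlo principle can be sketched for the information-theoretic outage probability. The example assumes Rayleigh fading with unit average power, Gaussian inputs, and an outage whenever the mutual information averaged over the F fading blocks falls below the rate; these modelling choices and the function name are assumptions. A density evolution outage estimate would replace the mutual information test with a density evolution convergence test for each fading realization.

```python
import numpy as np

def outage_probability(snr_db, rate, F=2, trials=200_000, seed=0):
    """Monte Carlo estimate of the outage probability over F fading blocks."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    # Squared fading magnitudes: exponential with unit mean (Rayleigh fading).
    gains = rng.exponential(1.0, size=(trials, F))
    # Mutual information per channel use, averaged over the F blocks.
    info = np.mean(np.log2(1.0 + gains * snr), axis=1)
    return float(np.mean(info < rate))

p_out_5db = outage_probability(5.0, rate=0.5)
p_out_15db = outage_probability(15.0, rate=0.5)
```

On a log-log plot the slope of such a curve at high SNR is the diversity order, which cannot exceed F for any code.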

Since the memory of the code plays an important role in exploiting
code diversity, we consider SC-LDPC codes with increasing memory
from 0 to 3. The diversity of the code, which is defined as the
slope of the WER curve, is calculated numerically from the DEO
curves presented in Fig. 5. For uncoupled LDPC codes, the
diversity is limited to d = 1.3 (see dotted line in Fig. 5). This
case can be interpreted as an SC-LDPC code with m_cc = 0. If we
now increase the coupling parameter to 1, 2 and 3, then the
diversity of SC-LDPC codes increases to 3, 6 and 10, respectively
[36]. The figure also shows the simulation results (dashed lines)
for finite length codes when the length of each individual coupled
code block is N = 200. The simulation results match closely with
the calculated DEO bounds.

An alternative approach to codes with memory is taken by the
root-LDPC codes [35] with a special check node structure called
rootcheck. Full diversity (d = F = 1/R) is provided to the
systematic information bits only, by connecting only one
information bit to every rootcheck. However, designing root-LDPC
codes with diversity order greater than 2 requires codes with rate
less than R = 1/2. The special structure of the codes makes it a
complicated task to generate good root-LDPC codes with high
diversity (and thus low rate).

Another key feature of SC-LDPC codes is their robustness against
variation in the channel. In the case of root-LDPC codes, the
parity-check matrix has to be designed for the specific channel
parameter F to provide a diversity of d = F to the information
bits. For SC-LDPC codes, however, it can be shown that a code
design for a specific value of F is not required, whereas the
diversity order strongly depends on the memory of the code. This
feature makes them very suitable for a wireless mobile
environment.


III. Non-Binary Codes and High Spectral Efficiency Codes 

Low-Density Parity-Check (LDPC) codes were originally proposed by
Gallager and re-discovered by MacKay & al. in the years after the
invention of turbo codes [10]. LDPC codes have been adopted in
several current standards, e.g., the IEEE 802.11n Wi-Fi standard,
DVB-S2, -T2, and -C2 digital video broadcasting over satellite,
terrestrial and cable, 10GBase-T Ethernet over twisted pairs, and
G.hn/G.9960 home networking over power lines. Together with turbo
codes, they are the modern coding technique of choice when it
comes to designing communication systems that approach the
theoretical limits of physical transmission media in terms of data
rate, transmission power, geographical reach and reliability.

All LDPC codes in current standards are binary codes. LDPC codes
over non-binary alphabets were already mentioned in Gallager's
original work and fully described in [37]. They offer two
practical advantages and one major disadvantage with respect to
binary codes:

• Advantage 1: encoding directly over the q-ary alphabet
corresponding to the signal constellation used for modulation
saves the mapping and de-mapping operations needed to transfer
between binary coding alphabet and non-binary modulation signal
space. Furthermore, the de-mapping operation is costly in terms of
complexity and introduces a loss of sufficient statistic and a
resulting performance loss that can only be partially countered by
proper choice of the mapping, or fully recovered by costly
iterations over the de-mapper and the decoder. With non-binary
codes, there is no mapping and no loss of efficiency through
de-mapping as the input messages to the decoder are a sufficient
statistic for the transmitted symbols, making non-binary LDPC
codes a tempting proposition for high spectral efficiency coding
over higher order constellations.

• Advantage 2: non-binary LDPC codes tend to exhibit less of a
performance loss when the block length is shortened to accommodate
delay constraints, as compared to binary codes.

• Disadvantage: the decoding complexity of LDPC codes increases
with the alphabet size.

The complexity issue has been addressed in a number of refinements
of the non-binary LDPC iterative decoding algorithm. The plain
description of the decoder requires convolutions of q-ary
distribution-valued messages in every constraint node of the
associated factor graph. A first and appealing improvement is
obtained by switching to the frequency domain where convolutions
become multiplications. This involves taking the q-point discrete
Fourier transform (DFT) if q is a prime number, or, for the more
practical case where q is a power of two, q = 2^m, taking the
q-point Walsh-Hadamard transform (WHT). This step reduces the
constraint node complexity from q^2 to q log q by evaluating the
appropriate transform in its “fast” butterfly-based
implementation, i.e., the Fast Fourier transform (FFT) for the DFT
and the Fast Hadamard transform (FHT) for the WHT.

While this first improvement is significant, the resulting
complexity is still much higher than that of the equivalent binary
decoder. The currently least complex methods known for decoding
non-binary LDPC codes are various realizations of the Extended
Min-Sum (EMS) [38] algorithm. In this method, convolutions are
evaluated directly in the time domain but messages are first
truncated to their most significant components, and convolutions
are evaluated on the truncated alphabets, resulting in a
significant complexity reduction with respect to the q^2
operations needed for a full convolution. While the principle of
the algorithm is easy enough to describe as we just did, its
implementation is in fact quite subtle because of the need to
remember which symbols are retained in the truncated alphabet for
each message and which configurations of input symbols map to
which output symbols in a convolution. Many technical improvements
of the EMS can be achieved by hardware-aware implementation of the
convolution operations, e.g., [39], [40].

In this section, we discuss two current research areas related to
non-binary codes. First, we will look at frequency-domain methods
that operate on truncated messages. The aim here is to achieve a
fairer comparison of complexity between the EMS and
frequency-domain methods, since much of the gain of the EMS is
achieved through message truncation, but in complexity comparisons
it is evaluated alongside frequency domain decoders operating on
full message sets. In the second part of this section, we will
look at a novel non-binary code construction operating over rings
rather than fields, with a decoding algorithm known as Analog
Digital Belief Propagation (ADBP) [41]. This promising new
approach has the merit that its complexity does not increase with
the alphabet size, in contrast to regular belief propagation for
LDPC codes over q-ary fields, making it an appealing proposition
for very high spectral efficiency communications.

A. Frequency domain decoding with truncated messages 

Fig. 6. Conceptual scenario for a degree 4 constraint node decoder

The ideal constraint node operation of an LDPC decoder operating
over a field GF(q) implements a Bayesian estimator for the
conceptual scenario illustrated in Figure 6. The estimator
provides the a-posteriori probability distribution of code symbol
X_1 given the observations Y_2, Y_3 and Y_4 of the code symbols
X_2, X_3 and X_4, respectively, where the sum of X_1, X_2, X_3 and
X_4 is zero over GF(q). Assuming that the input to the decoder is
provided in terms of a-posteriori probability distributions
P_{X_2|Y_2}, P_{X_3|Y_3} and P_{X_4|Y_4}, i.e., as
distribution-valued messages, it follows that the distribution
P_{X_1|Y_2 Y_3 Y_4} computed is a type of convolution of the input
distributions. For example, if the field is GF(3), i.e., the field
of numbers {0,1,2} using arithmetic modulo 3, then the output
probability that X_1 be zero given Y_2, Y_3 and Y_4 is the sum of
the probabilities of all configurations of X_2, X_3 and X_4 that
sum to zero, i.e., 0,0,0 or 0,1,2 or 0,2,1 or 1,0,2 or 1,1,1 or
1,2,0 or 2,0,1 or 2,1,0 or 2,2,2. This case results in a cyclic
convolution of the three distribution-valued input messages. Over
the more commonly used binary extension fields GF(2^m), where the
sum is defined as a bitwise sum, the corresponding operation is a
componentwise cyclic convolution in multi-dimensional binary
space.
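The GF(3) example can be checked numerically. The sketch below compares a brute-force enumeration of all configurations with a cyclic convolution evaluated through the DFT; the input distributions are arbitrary made-up values and the function names are hypothetical.

```python
import itertools
import numpy as np

def check_node_gf3_bruteforce(p2, p3, p4):
    """Sum the probabilities of all configurations consistent with each X_1."""
    out = np.zeros(3)
    for x2, x3, x4 in itertools.product(range(3), repeat=3):
        x1 = (-(x2 + x3 + x4)) % 3        # X_1 + X_2 + X_3 + X_4 = 0 (mod 3)
        out[x1] += p2[x2] * p3[x3] * p4[x4]
    return out

def check_node_gf3_dft(p2, p3, p4):
    """Same message via a length-3 cyclic convolution in the DFT domain."""
    spectrum = np.fft.fft(p2) * np.fft.fft(p3) * np.fft.fft(p4)
    s = np.real(np.fft.ifft(spectrum))    # distribution of X_2 + X_3 + X_4
    return s[(-np.arange(3)) % 3]         # X_1 is the negative of that sum

# Made-up input messages for illustration.
p2 = np.array([0.7, 0.2, 0.1])
p3 = np.array([0.5, 0.3, 0.2])
p4 = np.array([0.6, 0.1, 0.3])
direct = check_node_gf3_bruteforce(p2, p3, p4)
via_dft = check_node_gf3_dft(p2, p3, p4)
```

The entry direct[0] collects exactly the nine configurations listed in the text, and the two functions agree up to floating point error.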

Convolution can be efficiently operated in the frequency domain.
For a pure cyclic convolution such as the one illustrated over
GF(3), the transform required is the discrete Fourier transform
(DFT). The convolution of vectors in the time domain is equivalent
to the componentwise product of the corresponding vectors in the
transform domain. This process is illustrated in Figure 7. For the
more practically relevant binary extension fields GF(2^m), the
same process applies but the transform required is the
Walsh-Hadamard transform (WHT).
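For binary extension fields the same idea can be sketched with a fast Walsh-Hadamard transform. Field elements are represented here by the integers 0 to q-1 so that field addition is a bitwise XOR; this representation and the function names are assumptions for the example. Only the additive part of the constraint is handled; multiplication by non-zero parity-check coefficients, which permutes the message entries, is left out.

```python
import numpy as np

def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform (length a power of two)."""
    v = np.array(v, dtype=float)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            a = v[start:start + h].copy()
            b = v[start + h:start + 2 * h].copy()
            v[start:start + h] = a + b            # butterfly: sum
            v[start + h:start + 2 * h] = a - b    # butterfly: difference
        h *= 2
    return v

def xor_convolve_direct(p, r):
    """q^2 operations: distribution of A xor B for independent A and B."""
    out = np.zeros(len(p))
    for a in range(len(p)):
        for b in range(len(r)):
            out[a ^ b] += p[a] * r[b]
    return out

def xor_convolve_fht(p, r):
    """q log q operations: product in the transform domain, then inverse."""
    return fwht(fwht(p) * fwht(r)) / len(p)

rng = np.random.default_rng(1)
p = rng.random(8); p /= p.sum()       # made-up messages over GF(8)
r = rng.random(8); r /= r.sum()
direct = xor_convolve_direct(p, r)
fast = xor_convolve_fht(p, r)
```

The transform is its own inverse up to the factor 1/q, so the same butterfly routine serves in both directions.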
Fig. 7. Frequency domain convolution

Fig. 8. High spectrally efficient systems using binary codes and pragmatic
receiver (A), non binary codes and non binary BP (B), and ADBP (C).

Both the DFT and the WHT can be computed efficiently using a fast
butterfly structure, namely the Fast Fourier Transform (FFT) or the
Fast Hadamard Transform (FHT), requiring $q \log q$ operations, where q
is the alphabet size of the code. In a typical non-binary LDPC decoder
realization, these transforms, despite their efficient implementation,
still use up over 90% of the computing resources and hence constitute
the main hurdle for the practical implementability of non-binary LDPC
codes when compared to binary LDPC codes. The approach of the EMS is to
revert to time-domain convolutions but operate them on reduced alphabet
sizes $q' < q$ by truncating each incoming distribution-valued message
to its $q'$ largest components. The resulting algorithm is more
difficult to operate than may at first appear, because in such partial
convolutions one needs to retain which output values emerge from the
mappings of the differing truncated alphabets of each input message, so
the implementation needs to perform operations in GF(q) in parallel to
the convolution operations over the probabilities. The complexity
comparison becomes a comparison between $q'^2$ and $q \log q$. For
example, when operating in GF(64), the complexity of the frequency
domain based decoder is on the order of $6 \times 64 = 384$ operations
per constraint node per iteration, whereas the EMS with messages
truncated to $q' = 8$ is on the order of $8 \times 8 = 64$ operations
per constraint node per iteration. An added benefit of performing
convolutions in the time domain is that one can operate in the
logarithmic domain, replacing products by additions and approximating
sums by max operations, using the well established approach that also
underpins the min-sum method for decoding binary LDPC codes.
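
As a rough illustration of the frequency-domain approach, the following Python sketch compares a direct check-node convolution over $\mathrm{GF}(2^m)$ with the same convolution computed through a fast Walsh-Hadamard transform. It assumes that field elements are indexed by their binary representation, so that field addition is a bitwise XOR of indices, and it ignores the permutations induced by non-unit parity-check coefficients. The function names `fwht`, `convolve_direct` and `convolve_fwht` are illustrative and do not come from the paper.

```python
import numpy as np

def fwht(a):
    """Unnormalised fast Walsh-Hadamard transform of a length-2^m vector."""
    a = np.array(a, dtype=float)  # np.array copies, so the input is untouched
    h = 1
    while h < a.size:
        # One butterfly stage: combine neighbouring blocks of size h.
        for i in range(0, a.size, 2 * h):
            x = a[i:i + h].copy()
            y = a[i + h:i + 2 * h].copy()
            a[i:i + h] = x + y
            a[i + h:i + 2 * h] = x - y
        h *= 2
    return a

def convolve_direct(p, r):
    """Direct convolution over GF(2^m): about q^2 multiply-accumulates."""
    q = len(p)
    out = np.zeros(q)
    for a in range(q):
        for b in range(q):
            out[a ^ b] += p[a] * r[b]  # addition in GF(2^m) is XOR of indices
    return out

def convolve_fwht(p, r):
    """Same convolution as a pointwise product in the transform domain."""
    return fwht(fwht(p) * fwht(r)) / len(p)
```

Both functions return the same distribution up to rounding; the difference lies only in the operation count, which is the quantity compared in the text.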

The comparison described above is not completely fair because it
fails to take into account that message truncation may also be of
benefit when operating in the frequency domain. Specifically,
evaluating an FHT for truncated messages can be made more efficient if
we neutralise all operations that apply to the constant message tail
corresponding to the truncated portion of the message. In p2), the
expected number of operations in an FHT on truncated messages was
evaluated both exactly and using an approximation approach that makes
it easier to compute for large alphabet sizes. The resulting comparison
is promising and shows that much can be gained by operating in the
frequency domain on truncated messages. The study, however, is limited
to the direct transform and stops short of treating the more difficult
question of how to efficiently evaluate the inverse transform when one
is only interested in its $q'$ most significant output values.

B. LDPC codes over rings and Analog Digital Belief Propagation (ADBP)

Consider the problem of designing a transmission system with high
spectral efficiency that makes use of an encoder of rate $r_c$ and a
high-order q-PAM constellation, yielding a spectral efficiency
$\eta = r_c \log_2(q)$ [bits/dimension].

The current state-of-the-art solution, adopted in most standards, is
the pragmatic approach of Figure 8(A). A binary encoder is paired with
a q-PAM modulation using an interleaver and a proper mapping that
produces a sequence of constellation points. At the receiver, a
detector computes binary Log-Likelihood Ratios (LLRs) from symbol LLRs
and passes them to the binary iterative decoder through a suitably
designed interleaver. The complexity of the LLR computation is linear
in q and consequently exponential in the spectral efficiency $\eta$.
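
The detector step of the pragmatic receiver can be sketched as follows. This is a minimal example under stated assumptions: the constellation levels are taken as $2c - (q - 1)$, a binary-reflected Gray labelling is used, the noise is AWGN with variance `sigma2`, and the helper names `gray_pam` and `bit_llrs` are hypothetical. The loop over all q constellation points makes the linear dependence on q visible.

```python
import numpy as np

def gray_pam(q):
    """Levels and Gray bit labels of a q-PAM constellation (q a power of 2)."""
    m = q.bit_length() - 1
    idx = np.arange(q)
    levels = 2.0 * idx - (q - 1)           # assumed levels: -(q-1), ..., q-1
    gray = idx ^ (idx >> 1)                # binary-reflected Gray code
    bits = (gray[:, None] >> np.arange(m - 1, -1, -1)) & 1
    return levels, bits

def bit_llrs(y, sigma2, q):
    """Bit LLRs log P(b=0|y)/P(b=1|y) for one received sample y."""
    levels, bits = gray_pam(q)
    loglik = -(y - levels) ** 2 / (2.0 * sigma2)   # one term per symbol
    llr = np.empty(bits.shape[1])
    for k in range(bits.shape[1]):
        # Marginalise the symbol likelihoods over the two values of bit k.
        l0 = np.logaddexp.reduce(loglik[bits[:, k] == 0])
        l1 = np.logaddexp.reduce(loglik[bits[:, k] == 1])
        llr[k] = l0 - l1
    return llr
```

For q = 2 the result reduces to the familiar binary expression proportional to the received sample, which is a convenient sanity check.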

The feed-forward receiver scheme is associated with a “pragmatic”
capacity that is smaller than that of the modulation set and can be
maximized using Gray mapping. The feedback structure (dashed lines in
Figure 8(A)) can recover this capacity loss if coupled with a proper
binary code design. However, iterating between detector and decoder
increases the receiver complexity, as the conversion from bit to symbol
LLRs and vice versa is included in the loop, so that its complexity is
multiplied by the number of detector iterations.

A straightforward extension of an (N, K) binary encoder is obtained
by substituting the binary quantities at the input of the encoder with
q-ary symbols. Parity-check symbols are obtained by performing mod q
sums instead of mod 2 sums in the encoding procedure. The set of
codewords is then defined as follows:

$$\mathcal{C} = \{ c \in \mathbb{Z}_q^N : Hc = 0 \},$$

where the elements of the matrix H are constrained to take values only
in {0, ±1}. The asymptotic properties of this class of codes were
studied in [43] and [44], where they were named “modulo-q” or quantized
coset (MQC) codes. Both papers showed that they achieve the random
coding exponent and thus are capable of achieving capacity.
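
The encoding rule described above can be sketched in Python as follows. The sketch assumes, purely for illustration, a parity-check matrix in systematic form H = [P | I] with the entries of P in {0, ±1}; neither this form nor the names `encode_mod_q` and `is_codeword` are taken from the cited papers.

```python
import numpy as np

def encode_mod_q(u, P, q):
    """Systematic encoding for an assumed H = [P | I] over the integers mod q.

    The parity symbols are chosen so that P u + parity = 0 (mod q).
    """
    parity = (-(P @ u)) % q
    return np.concatenate([u, parity])

def is_codeword(c, H, q):
    """Check the defining condition H c = 0 (mod q)."""
    return not np.any((H @ c) % q)

# Small example over the integers modulo 8.
P = np.array([[1, -1, 0],
              [0, 1, 1]])
H = np.hstack([P, np.eye(2, dtype=int)])
c = encode_mod_q(np.array([3, 7, 5]), P, 8)
```

Because only additions and subtractions modulo q are involved, the same routine with q = 2 reduces to ordinary binary systematic encoding.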

The q-ary output symbols c from the encoder can then be directly
mapped to q-PAM constellations. At the receiver (Figure 8(B)), the use
of the regular non-binary BP iterative decoding algorithm requires
computing the log-likelihood ratios of the transmitted symbols in the
form of vectors with q - 1 elements. For AWGN the LLRs take the
following form:

$$\lambda(c) = -\frac{K_n}{2}\left[\,|y - x(c)|^2 - |y - x(c_0)|^2\,\right] \quad \forall c \neq c_0,$$

where x(c) denotes the constellation point associated with the symbol
c, $c_0$ is the reference symbol, and $K_n = 1/\sigma^2$ is the
concentration of the noise.
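
A minimal sketch of this computation is given below. It assumes that the LLR of symbol c is measured relative to a reference symbol $c_0$ as $-\tfrac{K_n}{2}\,[\,|y - x(c)|^2 - |y - x(c_0)|^2\,]$, that the constellation mapping is $x(c) = 2c - (q - 1)$, and that $c_0 = 0$ by default; the mapping, the default reference and the name `symbol_llrs` are assumptions made for the example.

```python
import numpy as np

def symbol_llrs(y, sigma2, q, c0=0):
    """Vector of q - 1 symbol LLRs for one received sample y over AWGN."""
    k_n = 1.0 / sigma2                      # concentration of the noise
    x = 2.0 * np.arange(q) - (q - 1)        # assumed q-PAM mapping x(c)
    llr = -0.5 * k_n * ((y - x) ** 2 - (y - x[c0]) ** 2)
    return np.delete(llr, c0)               # the reference entry is always 0
```

The message size grows linearly with q, which is the memory cost that ADBP later avoids.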

A straightforward implementation of non-binary BP results in memory
and complexity requirements of the order of $O(q)$ and $O(q^2)$,
respectively. In order to reduce the complexity of non-binary
decoding, several decoding schemes have been proposed in recent years.
These were discussed in the previous section and we summarize them
again here.

The first straightforward simplification is obtained at check nodes
by replacing the discrete convolution of messages, having complexity
$O(q^2)$, with the product of the message Fourier transforms. The use
of the FFT brings down the complexity to $O(q \log q)$. In [43], the
authors introduce a log-domain version of this approach that has
advantages in terms of numerical stability.

Some further simplifications have been proposed in p8) with the
Extended Min Sum (EMS) algorithm, where message vectors are reduced in
size by keeping only those elements in the alphabet with higher
reliability. In [46], p9) the same authors propose a hardware
implementation of the EMS decoding algorithm for non-binary LDPC codes.

In pT) the Min-Max algorithm is introduced together with a
reduced-complexity architecture called selective implementation, which
can reduce by a factor of 4 the operations required at the check nodes;
however, the complexity is still of the order of $O(q^2)$.

Several studies on VLSI implementation of non-binary decoders based
on the previous algorithms have been presented in the literature [48],
(^, ||^, (^, (^. The results of such studies confirm that all
non-binary decoders require complexity growing with the size of the
alphabet.

The analog digital belief propagation (ADBP) algorithm proposed in
ED represents a breakthrough in the reduction of the complexity and
memory requirements with respect to previously proposed algorithms, as
for ADBP both complexity and memory requirements are independent of the
size q of the alphabet. The main simplification of ADBP is due to the
fact that messages are not stored as vectors of size q containing the
likelihoods of the discrete variables (or equivalently their
log-likelihood ratios, LLRs) but rather as the two moments, or related
quantities, of some suitable predefined class of Gaussian-like
distributions. ADBP can be cast into the general class of
expectation-propagation algorithms described by Minka p5j. The main
contribution in ED is the definition of the suitable class of
distributions for the messages relative to wrapped and discretized
variables and the derivation of the updating equations for the message
parameters at the sum and repetition operations of the Tanner graph.

A receiver system using Analog Digital Belief Propagation
(Figure 8(C)) then takes as input messages directly the pair
$(K_n, y)$ of the concentration of the noise and the received samples.
This pair identifies a member of the predefined class of Gaussian-like
likelihoods, and ADBP performs the BP updating by constraining the
messages in the graph to stay in the same distribution class.

The exact ADBP updating equations, however, are not suitable for a
straightforward implementation due to the presence of complex
non-linear operations. Some simplifications to the updating equations
have been presented in [56]. In [57] the practical feasibility of ADBP
decoding is proved and post-synthesis results of the hardware
implementation of the required processing functions are provided.

The ADBP decoder cannot be applied to all types of linear codes over
GF(q), as multiplication by field elements different from ±1 is not
allowed in the graph. This constraint has not been taken into
consideration previously at the code design stage and requires the
construction of new and efficient codes. Although [43] and [44] show
that asymptotically this class of codes can achieve capacity, in the
literature there are no examples of good code constructions with finite
size.

The exceptional complexity reduction achieved by using ADBP,
together with the asymptotic results, motivates further research
effort in the design of good LDPC encoders within this class.

IV. Polar Codes 

Since its inception, the major challenge in coding theory has been
to find methods that would achieve Shannon limits using low-complexity
methods for code construction, encoding, and decoding. A solution to
this problem has been proposed in [14] through a method called “channel
polarization.” Rather than attacking the coding problem directly, the
polarization approach follows a purely information-theoretic route
whereby N independent identical copies of a given binary-input channel
W are manipulated by certain combining and splitting operations to
“manufacture” a second set of binary-input channels that have
capacities either near 0 or near 1, except for a fraction that vanishes
as N becomes large. Once such polarized channels are obtained, “polar
coding” consists of transmitting information at full rate over those
channels that are near perfect and fixing the inputs of the remaining
channels, say, to zero. In p4), it was shown that polar codes
constructed in this manner could achieve capacity with encoding and
decoding methods of complexity $O(N \log N)$. In subsequent work [58],
it was shown that the probability of frame error for polar codes goes
to zero roughly as $e^{-\sqrt{N}}$ for any fixed rate below capacity;
this result was later refined by p9), who determined the explicit form
of the dependence of the exponent on the code rate.
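
The polarization effect is easy to observe numerically in the special case of a binary erasure channel (BEC), where one combining and splitting step turns a channel with erasure probability z into a worse one with erasure probability $2z - z^2$ and a better one with erasure probability $z^2$. The sketch below applies this recursion n times and measures the fraction of synthesized channels that are not yet polarized. The choice of the BEC, the threshold `delta` and the function names are assumptions made for illustration.

```python
import numpy as np

def bec_polarize(eps, n):
    """Erasure probabilities of the 2^n channels synthesized from BEC(eps)."""
    z = np.array([eps], dtype=float)
    for _ in range(n):
        nxt = np.empty(2 * z.size)
        nxt[0::2] = 2 * z - z ** 2   # degraded ("minus") channel
        nxt[1::2] = z ** 2           # upgraded ("plus") channel
        z = nxt
    return z

def unpolarized_fraction(eps, n, delta=0.01):
    """Fraction of channels whose erasure probability is not within delta
    of either 0 or 1."""
    z = bec_polarize(eps, n)
    return float(np.mean((z > delta) & (z < 1 - delta)))
```

The average capacity is preserved at every step, so the fraction of near-perfect channels approaches the capacity 1 - eps of the original channel as n grows.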

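The basic binary polar code is a linear code defined for any block
length $N = 2^n$ in terms of a generator matrix

$$G_N = F^{\otimes n}, \qquad F = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}, \tag{1}$$

where $F^{\otimes n}$ denotes the nth Kronecker power of F. In polar
coding one encodes a data word $u = (u_1, \ldots, u_N)$ into a codeword
$x = (x_1, \ldots, x_N)$ through the transformation $x = uG_N$. For a
rate K/N polar code, one fixes N - K of the coordinates of u to zero,
effectively reducing $G_N$ to a K × N matrix. For example, for an
(N, K) = (8, 4) polar code, one may fix $u_1, u_2, u_3, u_5$ to zero
and obtain from

$$G_8 = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}$$

the 4 × 8 generator matrix

$$G_{4,8} = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}.$$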
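
The construction of the generator matrix as a Kronecker power, and the (8, 4) example with the first, second, third and fifth coordinates frozen, can be reproduced with the short sketch below. It forms the full matrix explicitly for clarity, which costs on the order of N^2 operations, whereas a practical encoder would use the recursive butterfly structure. The function names and the 0-based indexing of the frozen set are choices made for this example.

```python
import numpy as np

F = np.array([[1, 0],
              [1, 1]], dtype=int)

def polar_generator(n):
    """n-th Kronecker power of the kernel F, an N x N matrix with N = 2^n."""
    G = np.array([[1]], dtype=int)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def polar_encode(data, frozen, n):
    """Encode K data bits; `frozen` holds the 0-based indices fixed to zero."""
    N = 2 ** n
    u = np.zeros(N, dtype=int)
    info = [i for i in range(N) if i not in frozen]
    u[info] = data                       # data bits go on the free coordinates
    return (u @ polar_generator(n)) % 2  # x = u G_N over GF(2)

G8 = polar_generator(3)
G48 = G8[[3, 5, 6, 7]]   # rows kept when coordinates 1, 2, 3, 5 are frozen
```

Selecting rows of the full matrix and zeroing the corresponding entries of u are equivalent views of the same code.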


The polar code design problem consists in determining which set of
(N - K) coordinates to freeze so as to achieve the best possible
performance under successive cancellation (SC) decoding on a given
channel. It turns out that the solution to this problem depends on the
channel at hand, so in general there is no universal set of coordinates
that is guaranteed to work well for all channels of a given capacity.
In p4), only a heuristic method was given for the polar code design
problem. The papers jsg, i6g,i|6g provided a full solution with
complexity $O(N)$. With this development, polar codes became the first
provably capacity-achieving class of codes with polynomial-time
algorithms for code construction, encoding, and decoding.
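
For the binary erasure channel the design problem has a simple exact solution, since the erasure probability of each synthesized channel follows from the recursion $z \mapsto (2z - z^2,\; z^2)$. The sketch below freezes the N - K coordinates with the largest erasure probability; it is meant only as an illustration for this special channel and is not the general construction method of the cited papers. The function names are hypothetical, and indices are 0-based.

```python
def bec_erasure_probs(eps, n):
    """Erasure probability seen by each coordinate u_1..u_N for BEC(eps)."""
    z = [eps]
    for _ in range(n):
        # Each channel splits into a worse and a better one, in that order.
        z = [v for zi in z for v in (2 * zi - zi * zi, zi * zi)]
    return z

def frozen_set(eps, n, K):
    """0-based indices of the N - K least reliable coordinates."""
    z = bec_erasure_probs(eps, n)
    N = len(z)
    worst_first = sorted(range(N), key=lambda i: z[i], reverse=True)
    return sorted(worst_first[:N - K])
```

For an erasure probability of 0.5 and (N, K) = (8, 4), `frozen_set(0.5, 3, 4)` returns `[0, 1, 2, 4]`, that is, the first, second, third and fifth coordinates.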

Other important early theoretical contributions came in rapid
succession from [63], gg, m, leg. Polar coding was extended to
non-binary alphabets in ]68|7|^, [70], | |7T|. Polar code designs
using alternative generator matrices with the goal of improving the
code performance were studied in fT^, IZl), (H)’Cl’IZH-.


As stated above, polar coding is a channel dependent design. The
performance of polar codes under “channel mismatch” (i.e., using a
polar code optimized for one channel on a different one) has been
studied by fTT), who showed that there would be a rate loss. As shown
in [78], the non-universality of polar codes is a property of the
suboptimal low-complexity successive cancellation decoding algorithm;
under ML decoding, polar codes are universal. More precisely, [78]
shows that a polar code optimized for a Binary Symmetric Channel (BSC)
achieves the capacity of any other binary-input channel of the same
capacity under ML decoding. This result is very interesting
theoretically since it gives a constructive universal code for all
binary-input channels; however, it does this at the expense of giving
up the $O(N \log N)$ decoding algorithm. In more recent work ig,
universal polar coding schemes have been described, which come at the
expense of lengthening the regular polar code construction.

It was recognized from the beginning that the finite length
performance of polar codes was not competitive with the
state-of-the-art. This was in part due to the suboptimal nature of the
standard successive cancellation (SC) decoding algorithm, and in part
due to the relatively weak minimum distance properties of these codes.
Another negative point was that the SC decoder made its decisions
sequentially, which meant that the decoder latency would grow at least
linearly with the code length, resulting in a throughput bottleneck.
Despite these shortcomings, interest in polar codes for potential
applications continued. This continued interest may be attributed to
several factors. First, polar codes are firmly rooted in sound,
well-understood theoretical principles. Second, while the performance
of the basic polar code is not competitive with the state-of-the-art at
short to practical block lengths, it is still good enough to maintain
hope that with enhancements these codes can become a viable
alternative. This is not surprising given that polar codes are close
cousins of Reed-Muller codes, which are still an important family of
codes HD in many respects, including performance. Third, polar codes
have the unique property that their code rate can be adjusted from 0 to
1 without changing the encoder and decoder. Fourth, polar codes have a
recursive structure, based on Plotkin’s |u|u+v| construction [82],
which makes them highly suitable for implementation in hardware. For
these and other reasons, there have been a great number of proposals in
the last few years to improve the performance of polar codes while
retaining their attractive properties. The proposed methods may be
classified essentially into two categories: encoder-side and
decoder-side techniques.

Among the encoder-side techniques, one may count the non-binary
polar codes and binary polar codes starting with a larger base matrix
(kernel); however, these techniques have not yet attracted much
attention from a practical viewpoint due to their complexity. Other
encoder-side techniques that have been tried include the usual
concatenation schemes with Reed-Solomon codes Isg, and other
concatenation schemes [84], [85], fS^.

Two decoder-side techniques that were tried early on to improve
polar code performance are belief propagation (BP) decoding and
trellis-based ML decoding [88]. The BP decoder did not improve the SC
decoder performance by any significant amount; however, it continues to
be of interest since the BP decoder has the potential to achieve higher
throughputs compared to SC decoding [89].

The most notable improvement in polar coding performance came by
using a list decoder [90] with CRC, which achieved near ML performance
with complexity roughly $O(LN \log N)$ for a list size L and code
length N. The CRC helps in two ways. First, it increases the code
minimum distance at relatively small cost in terms of coding
efficiency, thus improving code performance especially at high SNR.
Second, the CRC helps select the correct codeword from the set of
candidate codewords offered by the list decoder. It should be mentioned
that the above list decoding algorithm for polar codes was an
adaptation of an earlier similar algorithm given in HD in the context
of RM codes. The vast literature on RM codes continues to be a rich
source of ideas in terms of design of efficient decoding techniques for
polar codes. A survey of RM codes from the perspective of decoders for
polar codes has been given in [92].

Fig. 9. Performance comparison of polar and LDPC codes.

We end this survey by giving a performance result for polar codes.
Figure 9 compares the performance of a (2048,1008) polar code with the
WiMAX (2304,1152) LDPC code. The polar code is obtained from a
(2048,1024) code by inserting a 16-bit CRC into the data and is decoded
by a list-of-32 decoder. The LDPC code results are from the database
provided by [9^; decoding is by belief propagation with the maximum
number of iterations limited to 30 and 100 in the results presented.
The realization that polar coding performance can rival the
state-of-the-art has spurred intense research for practical
implementations of these codes. We omit from this survey the
implementation-oriented papers since that is already a very large topic
by itself. Whether polar codes will ever appear as part of the
portfolio of solutions in future systems remains uncertain. The
state-of-the-art in error correction coding is mature, with a firm
footprint by turbo and LDPC codes. Whether polar codes offer
significant advantages to make room for themselves in practical
applications depends in large part on further innovation on the
subject.

V. Conclusion 

We have presented three areas of active research in coding theory.
We introduced spatially coupled LDPC codes for which the asymptotic
performance of the iterative decoder is improved to that of the optimal
decoder. We have discussed non-binary LDPC codes and have introduced a
new decoding algorithm, analog digital belief propagation (ADBP), whose
complexity does not increase with the alphabet size. Finally, we have
described polar coding, a novel code construction based on a
phenomenon coined channel polarization, which can be proved
theoretically to achieve channel capacity. We have stated a number of
open problems, among them:

• When decoding non-binary LDPC codes in the frequency domain, can we
design a reduced complexity inverse transform if we are only
interested in the larger components of the resulting
distribution-valued message?

• How do we design LDPC codes over rings of integers to optimize the
performance of the ADBP decoder?

• While the potential of polar codes is established and proven, how
can we improve the performance of their low-complexity sub-optimal
decoders at moderate codeword lengths in order for them to rival the
performance of LDPC and turbo codes in practice? Can the performance
of belief propagation be improved in this context, or are there
perhaps brand-new decoding approaches that could solve this dilemma?

We hope to have shown in this paper that coding theory is an 
active area of research with many challenges remaining and a 
number of promising innovations on their way to maturing into 
technological advances in the coming years. 